Exploring High-dimensional Data with Robust Principal Components

Author

  • P. Filzmoser
Abstract

For high-dimensional data with a small sample size, it is difficult to compute principal components in a robust way. We mention an algorithm that is both highly precise and fast to compute. The robust principal components are then used to compute two distances for each observation: its distance within the (sub-)space spanned by the principal components, and its distance to this (sub-)space. Both distance measures retain valuable information about the multivariate data structure, and plotting their magnitudes helps to reveal that information.
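The two distance measures named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the algorithm, so this example substitutes spherical PCA (median centering, projection onto the unit sphere, then an SVD) as a simple robust stand-in; it is not the paper's method. The function names `spherical_pca` and `pca_distances`, the use of the coordinate-wise median as center, and the MAD-based scaling of the scores are all assumptions made for illustration.

```python
import numpy as np


def spherical_pca(X, k):
    """Spherical PCA, used here only as a simple robust stand-in.

    Assumption: this is NOT the algorithm referred to in the abstract.
    Returns the center, the p x k loadings, and a robust scale per component.
    """
    center = np.median(X, axis=0)            # coordinate-wise median as robust center
    Z = X - center
    norms = np.linalg.norm(Z, axis=1)
    norms[norms == 0] = 1.0                  # guard against division by zero
    U = Z / norms[:, None]                   # project observations onto the unit sphere
    _, _, Vt = np.linalg.svd(U, full_matrices=False)
    loadings = Vt[:k].T                      # first k directions, shape (p, k)
    scores = Z @ loadings
    # MAD of each score column, scaled for consistency at the normal distribution
    med = np.median(scores, axis=0)
    scale = 1.4826 * np.median(np.abs(scores - med), axis=0)
    return center, loadings, scale


def pca_distances(X, center, loadings, scale):
    """Distance within the PC subspace (sd) and distance to it (od)."""
    Z = X - center
    scores = Z @ loadings
    # score distance: scaled Euclidean norm of the scores inside the subspace
    sd = np.sqrt(np.sum((scores / scale) ** 2, axis=1))
    # orthogonal distance: norm of the part of Z not explained by the subspace
    residual = Z - scores @ loadings.T
    od = np.linalg.norm(residual, axis=1)
    return sd, od
```

Plotting `sd` against `od` for all observations gives one possible diagnostic display: points with a large `od` lie far from the fitted subspace, while points with a large `sd` are extreme within it.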

Similar resources

Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data

Background and purpose: By evolving science, knowledge, and technology, we deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and interpretation. For high-dimension problems, classical methods are not reliable because of a large number of predictor variable...

Robust Principal Component Analysis and Fractal Methods to Delineate Mineralization-Related Hydrothermally-Altered Zones from ASTER Data: A Case Study of Dehaj Terrain, Central Iran

The Dehaj area, located in the southern part of the Urumieh-Dokhtar magmatic belt, is a well-endowed terrain hosting a number of world-class porphyry copper deposits. These deposits are all hosted in an acidic to intermediate volcano-plutonic sequence greatly affected by various types of the hydrothermal alterations, whether argillic, phyllic or propylitic. Although there are a handful of hithe...

Robust classification in high dimensions based on the SIMCA Method

In this paper we first investigate the robustness of the SIMCA method for classifying high-dimensional observations. It turns out that both stages of the algorithm, the estimation of principal components and the construction of a classification rule, can be highly disturbed by the presence of outliers. Therefore we propose a robust procedure RSIMCA which is based on a robust Principal Component...

Persian Handwriting Analysis Using Functional Principal Components

Principal components analysis is a well-known statistical method for dealing with large dependent data sets. It is also used with functional data, both for data reduction and for representing variation. On the other hand, "handwriting" is one of the objects studied in various statistical fields, such as pattern recognition and shape analysis. Considering time as the argument,...

Robust PCA and classification in biosciences

Motivation: Principal components analysis (PCA) is a very popular dimension reduction technique that is widely used as a first step in the analysis of high-dimensional microarray data. However, the classical approach that is based on the mean and the sample covariance matrix of the data is very sensitive to outliers. Also, classification methods based on this covariance matrix do not give good r...

Publication date: 2007